Jie Lou

Your Teacher Can't Help You Here: Combating Supervision Fidelity Decay in On-Policy Distillation

May 29, 2026
Via arXiv

Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation

May 19, 2026
Via arXiv

Learning from Failures: Correction-Oriented Policy Optimization with Verifiable Rewards

May 14, 2026
Via arXiv

Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence

Apr 27, 2026
Via arXiv

Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Apr 14, 2026
Via arXiv

Multimodal OCR: Parse Anything from Documents

Mar 13, 2026
Via arXiv

Tackling Length Inflation Without Trade-offs: Group Relative Reward Rescaling for Reinforcement Learning

Mar 11, 2026
Via arXiv

NVIDIA Nemotron 3: Efficient and Open Intelligence

Dec 24, 2025
Via arXiv

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Dec 23, 2025
Via arXiv

Coupled Variational Reinforcement Learning for Language Model General Reasoning

Dec 14, 2025
Via arXiv